{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Monte Carlo Simulation and Linear Algebra" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" }, "tags": [ "remove-cell" ] }, "source": [ "**CS1302 Introduction to Computer Programming**\n", "___" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2020-11-27T11:20:04.656873Z", "start_time": "2020-11-27T11:20:04.651575Z" }, "slideshow": { "slide_type": "fragment" }, "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "import math\n", "import random\n", "import ipywidgets as widgets\n", "%reload_ext mytutor" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Monte Carlo simulation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**What is Monte Carlo simulation?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " > The name Monte Carlo refers to the [Monte Carlo Casino in Monaco](https://en.wikipedia.org/wiki/Monte_Carlo_Casino) where Ulam's uncle would borrow money from relatives to gamble." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "It would be nice to simulate the casino, so Ulam's uncle did not need to borrow money to play in the casino. \n", "Actually...,\n", "- Monte Carlo is the code name of the secret project for creating the [hydrogen bomb](https://en.wikipedia.org/wiki/Monte_Carlo_method). \n", "- [Ulam](https://en.wikipedia.org/wiki/Stanislaw_Ulam) worked with [John von Neumann](https://en.wikipedia.org/wiki/John_von_Neumann) to program the first electronic computer ENIAC to simulate a computational model of a thermonuclear reaction.\n", "\n", "(See also [The Beginning of the Monte Carlo Method](https://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-88-9067) for a more detailed account.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to compute the value of $\\pi$**?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "An application of Monte Carlo simulation is in approximating $\\pi$ using \n", "the [Buffon's needle](https://en.wikipedia.org/wiki/Buffon%27s_needle_problem). \n", "There is [a program](https://www.khanacademy.org/computer-programming/pi-by-buffons-needle/6695500989890560) written in javascript to do this." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The javascript program is a bit long to digest, so we will use an alternative simulation that is easier to understand/program." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If we uniformly randomly pick a point in a square. What is the chance it is in the inscribed circle, i.e., the biggest circle inside the square?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "The chance is the area of the circle divided by the area of the square. Suppose the square has length $\\ell$, then the chance is\n", "\n", "$$ \\frac{\\pi (\\ell /2)^2}{ (\\ell)^2 } = \\frac{\\pi}4 $$\n", "independent of the length $\\ell$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Complete the following function to return an approximation of $\\pi$ as follows:\n", "1. Simulate the random process of picking a point from a square repeatedly `n` times by \n", " generating the $x$ and $y$ coordinates uniformly randomly from a unit interval $[0,1)$.\n", "2. Compute the fraction of times the point is in the first quadrant of the inscribed circle as shown in the figure below.\n", "3. Return $4$ times the fraction as the approximation.\n", "

\"Pi

" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:07.672010Z", "start_time": "2020-11-17T05:05:04.428686Z" }, "nbgrader": { "grade": false, "grade_id": "approximate_pi", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Approximate: 3.1424256\n", "Ground truth: 3.141592653589793\n" ] } ], "source": [ "def approximate_pi(n):\n", " ### BEGIN SOLUTION\n", " return (\n", " 4\n", " * len(\n", " [True for i in range(n) if random.random() ** 2 + random.random() ** 2 < 1]\n", " )\n", " / n\n", " )\n", " ### END SOLUTION\n", "print(f\"Approximate: {approximate_pi(int(1e7))}\\nGround truth: {math.pi}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How accurate is the approximation?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The following uses a powerful library `numpy` for computing to return a [$95\\%$-confidence interval](http://onlinestatbook.com/2/estimation/mean.html#:~:text=To%20compute%20the%2095%25%20confidence,be%20between%20the%20cutoff%20points.)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.218178Z", "start_time": "2020-11-17T05:05:07.673752Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "95%-confidence interval: [3.14129292 3.14336948]\n", "Estimate: 3.1423 ± 0.0010\n", "Ground truth: 3.141592653589793\n" ] } ], "source": [ "import numpy as np\n", "\n", "\n", "def np_approximate_pi(n):\n", " in_circle = (np.random.random((n, 2)) ** 2).sum(axis=-1) < 1\n", " mean = 4 * in_circle.mean()\n", " std = 4 * in_circle.std() / n ** 0.5\n", " return np.array([mean - 2 * std, mean + 2 * std])\n", "\n", "\n", "interval = np_approximate_pi(int(1e7))\n", "print(\n", " f\"\"\"95%-confidence interval: {interval}\n", "Estimate: {interval.mean():.4f} ± {(interval[1]-interval[0])/2:.4f}\n", "Ground truth: {math.pi}\"\"\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that the computation done using `numpy` is over $5$ times faster despite the additional computation of the standard deviation." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "There are faster methods to approximate $\\pi$ such as the [Chudnovsky_algorithm](https://en.wikipedia.org/wiki/Chudnovsky_algorithm), but Monte-Carlo method is still useful in more complicated situations. \n", "E.g., see the Monte Carlo simulation of a [real-life situation](https://www.youtube.com/watch?v=-fCVxTTAtFQ) in playing basketball:\n", "> \"When down by three and left with only 30 seconds is it better to attempt a hard 3-point shot or an easy 2-point shot and get another possession?\" --LeBron James" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Linear Algebra" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to solve a linear equation?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Given the following linear equation in variable $x$ with real-valued coefficient $a$ and $b$,\n", "\n", "$$ a x = b,$$\n", "what is the value of $x$ that satisfies the equation?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Complete the following function to return either the unique solution of $x$ or `None` if a unique solution does not exist." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.250889Z", "start_time": "2020-11-17T05:05:08.220095Z" }, "nbgrader": { "grade": false, "grade_id": "solve_linear_equation", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "706ee6de5f904f17a4dfacd90c0d4d22", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(IntSlider(value=2, description='a', max=5), IntSlider(value=3, description='b', max=5), …" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def solve_linear_equation(a, b):\n", " ### BEGIN SOLUTION\n", " return b / a if a != 0 else None\n", " ### END SOLUTION\n", "\n", "\n", "@widgets.interact(a=(0, 5, 1), b=(0, 5, 1))\n", "def linear_equation_solver(a=2, b=3):\n", " print(\n", " f\"\"\"linear equation: {a}x = {b}\n", " solution: x = {solve_linear_equation(a,b)}\"\"\"\n", " )" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to solve multiple linear equations?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In the general case, we have a system of $m$ linear equations and $n$ variables:\n", "\n", "$$ \\begin{aligned}\n", "a_{00} x_0 + a_{01} x_1 + \\dots + a_{0(n-1)} x_{n-1} &= b_0\\\\\n", "a_{10} x_0 + a_{11} x_1 + \\dots + a_{1(n-1)} x_{n-1} &= b_1\\\\\n", "\\vdots\\kern2em &= \\vdots\\\\\n", "a_{(m-1)0} x_0 + a_{(m-1)1} x_1 + \\dots + a_{(m-1)(n-1)} x_{n-1} &= b_{m-1}\\\\\n", "\\end{aligned}\n", "$$\n", "where\n", "- $x_j$ for $j\\in \\{0,\\dots,n-1\\}$ are the variables, and\n", "- $a_{ij}$ and $b_j$ for $i\\in \\{0,\\dots,m-1\\}$ and $j\\in \\{0,\\dots,n-1\\}$ are the coefficients.\n", "\n", "A fundamental problem in linear algebra is to compute the unique solution to the system if it exists." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We will consider the simpler 2-by-2 system with 2 variables and 2 equations:\n", "\n", "$$ \\begin{aligned}\n", "a_{00} x_0 + a_{01} x_1 &= b_0\\\\\n", "a_{10} x_0 + a_{11} x_1 &= b_1.\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To get an idea of the solution, suppose \n", "\n", "$$a_{00}=a_{11}=1, a_{01} = a_{10} = 0.$$\n", "The system of equations becomes\n", "\n", "$$ \\begin{aligned}\n", "x_0 \\hphantom{+ x_1} &= b_0\\\\\n", "\\hphantom{x_0 +} x_1 &= b_1,\n", "\\end{aligned}\n", "$$\n", "which gives the solution directly." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What about $a_{00}=a_{11}=2$ instead?\n", "\n", "$$ \\begin{aligned}\n", "2x_0 \\hphantom{+ x_1} &= b_0\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= b_1,\n", "\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To obtain the solution, we simply divide both equations by 2:\n", "\n", "$$ \\begin{aligned}\n", "x_0 \\hphantom{+ x_1} &= \\frac{b_0}2\\\\\n", "\\hphantom{x_0 +} x_1 &= \\frac{b_1}2.\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "What if $a_{01}=2$ instead?\n", "\n", "$$ \\begin{aligned}\n", "2x_0 + 2x_1 &= b_0\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= b_1\\\\\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The second equation gives the solution of $x_1$, and we can use the solution in the first equation to solve for $x_0$. More precisely:\n", "- Subtract the second equation from the first one:\n", "\n", "$$ \\begin{aligned}\n", "2x_0 \\hphantom{+2x_1} &= b_0 - b_1\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= b_1\\\\\n", "\\end{aligned}\n", "$$\n", "- Divide both equation by 2:\n", "\n", "$$ \\begin{aligned}\n", "x_0 \\hphantom{+ x_1} &= \\frac{b_0 - b_1}2\\\\\n", "\\hphantom{x_0 +} x_1 &= \\frac{b_1}2\\\\\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The above operations are called *row operations* in linear algebra: each row is an equation. \n", "A system of linear equations can be solved by the linear operations of \n", "1. multiplying an equation by a constant, and\n", "2. subtracting one equation from another." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "How to write a program to solve a general 2-by-2 system? We will use the `numpy` library." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Creating `numpy` arrays" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to store the coefficients?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In linear algebra, a system of equations such as\n", "\n", "$$ \\begin{aligned}\n", "a_{00} x_0 + a_{01} x_1 &= b_0\\\\\n", "a_{10} x_0 + a_{11} x_1 &= b_1\n", "\\end{aligned}\n", "$$\n", "is written concisely in *matrix* form as $ \\mathbf{A} \\mathbf{x} = \\mathbf{b} $:\n", "\n", "$$\\overbrace{\\begin{bmatrix}\n", "a_{00} & a_{01}\\\\\n", "a_{10} & a_{11}\n", "\\end{bmatrix}}^{\\mathbf{A}}\n", "\\overbrace{\n", "\\begin{bmatrix}\n", "x_0\\\\\n", "x_1\n", "\\end{bmatrix}}\n", "^{\\mathbf{x}}\n", "= \\overbrace{\\begin{bmatrix}\n", "b_0\\\\\n", "b_1\n", "\\end{bmatrix}}^{\\mathbf{b}},\n", "$$\n", "where\n", "$ \\mathbf{A} \\mathbf{x}$ is the *matrix multiplication*\n", "\n", "$$ \\mathbf{A} \\mathbf{x} = \\begin{bmatrix}\n", "a_{00} x_0 + a_{01} x_1\\\\\n", "a_{10} x_0 + a_{11} x_1\n", "\\end{bmatrix}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We say that $\\mathbf{A}$ is a [*matrix*](https://en.wikipedia.org/wiki/Matrix_(mathematics)) and its dimension/shape is $2$-by-$2$:\n", "- The first dimension/axis has size $2$. We also say that the matrix has $2$ rows.\n", "- The second dimension/axis has size $2$. We also say that the matrix has $2$ columns.\n", "$\\mathbf{x}$ and $\\mathbf{b}$ are called column vectors, which are matrices with one column." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Consider the example\n", "\n", "$$ \\begin{aligned}\n", "2x_0 + 2x_1 &= 1\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= 1,\n", "\\end{aligned}$$\n", "\n", "or in matrix form with\n", "\n", "$$ \\begin{aligned}\n", "\\mathbf{A}&=\\begin{bmatrix}\n", "a_{00} & a_{01} \\\\\n", "a_{10} & a_{11} \n", "\\end{bmatrix} \n", "= \\begin{bmatrix}\n", "2 & 2 \\\\\n", "0 & 2 \n", "\\end{bmatrix}\\\\\n", "\\mathbf{b}&=\\begin{bmatrix}\n", "b_0\\\\\n", "b_1\n", "\\end{bmatrix} = \\begin{bmatrix}\n", "1\\\\\n", "1\n", "\\end{bmatrix}\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Instead of using `list` to store the matrix, we will use a `numpy` array." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.258666Z", "start_time": "2020-11-17T05:05:08.252256Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([[2., 2.],\n", " [0., 2.]]),\n", " array([1., 1.]))" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[2.0, 2], [0, 2]])\n", "b = np.array([1.0, 1])\n", "A, b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Compared to `list`, `numpy` array is often more efficient and has more useful attributes." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.277037Z", "start_time": "2020-11-17T05:05:08.259940Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Common attributes:\n", " copy sort\n", "\n", "Array-specific attributes:\n", " dtype max reshape dot compress getfield argmax view conjugate strides swapaxes trace squeeze put cumprod imag nbytes nonzero fill real size item shape T round tolist cumsum itemset resize var std take tostring repeat choose base flat clip flags diagonal argmin dumps setfield partition ptp mean searchsorted sum data ravel transpose conj ctypes argsort setflags dump itemsize astype any tobytes ndim min newbyteorder prod all tofile flatten byteswap argpartition\n", "\n", "List-specific attributes:\n", " count remove clear index insert reverse append pop extend\n" ] } ], "source": [ "array_attributes = set(attr for attr in dir(np.array([])) if attr[0] != \"_\")\n", "list_attributes = set(attr for attr in dir(list) if attr[0] != \"_\")\n", "print(\"\\nCommon attributes:\\n\", *(array_attributes & list_attributes))\n", "print(\"\\nArray-specific attributes:\\n\", *(array_attributes - list_attributes))\n", "print(\"\\nList-specific attributes:\\n\", *(list_attributes - array_attributes))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The following attributes give the dimension/shape, number of dimensions, size, and data type." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.284772Z", "start_time": "2020-11-17T05:05:08.278614Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[2. 2.]\n", " [0. 2.]]\n", " shape: (2, 2)\n", " ndim: 2\n", " size: 4\n", " dtype: float64\n", " \n", "[1. 1.]\n", " shape: (2,)\n", " ndim: 1\n", " size: 2\n", " dtype: float64\n", " \n" ] } ], "source": [ "for array in A, b:\n", " print(\n", " f\"\"\"{array}\n", " shape: {array.shape}\n", " ndim: {array.ndim}\n", " size: {array.size}\n", " dtype: {array.dtype}\n", " \"\"\"\n", " )" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that the function `len` only returns the size of the first dimension:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.316025Z", "start_time": "2020-11-17T05:05:08.310303Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assert A.shape[0] == len(A)\n", "len(A)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Unlike `list`, every `numpy` array has a data type. For efficient computation/storage, numpy implements different data types with different storage sizes:\n", "* integer: `int8`, `int16`, `int32`, `int64`, `uint8`, ...\n", "* float: `float16`, `float32`, `float64`, ...\n", "* complex: `complex64`, `complex128`, ...\n", "* boolean: `bool8`\n", "* Unicode: `string`\n", "* Object: `object`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "E.g., `int64` is the 64-bit integer. Unlike `int`, `int64` has a range." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.580209Z", "start_time": "2020-11-17T05:05:08.528600Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "range: -9223372036854775808 to 9223372036854775807\n" ] }, { "ename": "OverflowError", "evalue": "Python int too large to convert to C long", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mOverflowError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/tmp/ipykernel_861/1402166890.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mget_ipython\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrun_line_magic\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'pinfo'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'np.int64'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"range: {np.int64(-2**63)} to {np.int64(2**63-1)}\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mint64\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0;36m63\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# overflow error\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mOverflowError\u001b[0m: Python int too large to convert to C long" ] } ], "source": [ "np.int64?\n", "print(f\"range: {np.int64(-2**63)} to {np.int64(2**63-1)}\")\n", "np.int64(2 ** 63) # overflow error" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can use the `astype` method to convert the data type:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.690620Z", "start_time": "2020-11-17T05:05:08.683292Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[2 2]\n", " [0 2]] int64\n", "[[2. 2.]\n", " [0. 2.]] float32\n" ] } ], "source": [ "A_int64 = A.astype(int) # converts to int64 by default\n", "A_float32 = A.astype(np.float32) # converts to float32\n", "for array in A_int64, A_float32:\n", " print(array, array.dtype)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "We have to be careful about assigning items of different types to an array." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.757719Z", "start_time": "2020-11-17T05:05:08.747989Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2]\n", " [0 2]]\n", "[[0 2]\n", " [0 2]]\n" ] }, { "data": { "text/plain": [ "array([1., 1.])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A_int64[0, 0] = 1\n", "print(A_int64)\n", "A_int64[0, 0] = 0.5\n", "print(A_int64) # intended assignment fails\n", "np.array([int(1), float(1)]) # will be all floating point numbers" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-11-12T14:52:46.259630Z", "start_time": "2020-11-12T14:52:46.252370Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Create a heterogeneous `numpy` array to store both integers and strings:\n", "```Python\n", "[0, 1, 2, 'a', 'b', 'c']\n", "```\n", "*Hint:* There is an `numpy` data type called `object`." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.771960Z", "start_time": "2020-11-17T05:05:08.760396Z" }, "nbgrader": { "grade": false, "grade_id": "hetero", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 'a', 'b', 'c'], dtype=object)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mInit signature:\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mobject\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "The base class of the class hierarchy.\n", "\n", "When called, it accepts no arguments and returns a new featureless\n", "instance that has no instance attributes and cannot be given any.\n", "\u001b[0;31mType:\u001b[0m type\n", "\u001b[0;31mSubclasses:\u001b[0m type, weakref, weakcallableproxy, weakproxy, int, bytearray, bytes, list, NoneType, NotImplementedType, ...\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.object?\n", "### BEGIN SOLUTION\n", "heterogeneous_np_array = np.array([*range(3), *\"abc\"], dtype=object)\n", "### END SOLUTION\n", "heterogeneous_np_array" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Be careful when creating arrays of `tuple`/`list`:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.873199Z", "start_time": "2020-11-17T05:05:08.863225Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[(1, 2) list([3, 4, 5])] \n", "shape: (2,) length: 2 size: 2\n", "[[1 2]\n", " [3 4]] \n", "shape: (2, 2) length: 2 size: 4\n" ] } ], "source": [ "for array in (\n", " np.array([(1, 2), [3, 4, 5]], dtype=object),\n", " np.array([(1, 2), [3, 4]], dtype=object),\n", "):\n", " print(array, \"\\nshape:\", array.shape, \"length:\", len(array), \"size:\", array.size)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`numpy` provides many functions to create an array:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:08.983801Z", "start_time": "2020-11-17T05:05:08.973650Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([], dtype=float64),\n", " array([0.]),\n", " array([[[0., 0., 0., 0.],\n", " [0., 0., 0., 0.],\n", " [0., 0., 0., 0.]],\n", " \n", " [[0., 0., 0., 0.],\n", " [0., 0., 0., 0.],\n", " [0., 0., 0., 0.]]]))" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mDocstring:\u001b[0m\n", "zeros(shape, dtype=float, order='C', *, like=None)\n", "\n", "Return a new array of given shape and type, filled with zeros.\n", "\n", "Parameters\n", "----------\n", "shape : int or tuple of ints\n", " Shape of the new array, e.g., ``(2, 3)`` or ``2``.\n", "dtype : data-type, optional\n", " The desired data-type for the array, e.g., `numpy.int8`. Default is\n", " `numpy.float64`.\n", "order : {'C', 'F'}, optional, default: 'C'\n", " Whether to store multi-dimensional data in row-major\n", " (C-style) or column-major (Fortran-style) order in\n", " memory.\n", "like : array_like\n", " Reference object to allow the creation of arrays which are not\n", " NumPy arrays. If an array-like passed in as ``like`` supports\n", " the ``__array_function__`` protocol, the result will be defined\n", " by it. In this case, it ensures the creation of an array object\n", " compatible with that passed in via this argument.\n", "\n", " .. note::\n", " The ``like`` keyword is an experimental feature pending on\n", " acceptance of :ref:`NEP 35 `.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "out : ndarray\n", " Array of zeros with the given shape, dtype, and order.\n", "\n", "See Also\n", "--------\n", "zeros_like : Return an array of zeros with shape and type of input.\n", "empty : Return a new uninitialized array.\n", "ones : Return a new array setting values to one.\n", "full : Return a new array of given shape filled with value.\n", "\n", "Examples\n", "--------\n", ">>> np.zeros(5)\n", "array([ 0., 0., 0., 0., 0.])\n", "\n", ">>> np.zeros((5,), dtype=int)\n", "array([0, 0, 0, 0, 0])\n", "\n", ">>> np.zeros((2, 1))\n", "array([[ 0.],\n", " [ 0.]])\n", "\n", ">>> s = (2,2)\n", ">>> np.zeros(s)\n", "array([[ 0., 0.],\n", " [ 0., 0.]])\n", "\n", ">>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype\n", "array([(0, 0), (0, 0)],\n", " dtype=[('x', '`.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "out : ndarray\n", " Array of ones with the given shape, dtype, and order.\n", "\n", "See Also\n", "--------\n", "ones_like : Return an array of ones with shape and type of input.\n", "empty : Return a new uninitialized array.\n", "zeros : Return a new array setting values to zero.\n", "full : Return a new array of given shape filled with value.\n", "\n", "\n", "Examples\n", "--------\n", ">>> np.ones(5)\n", "array([1., 1., 1., 1., 1.])\n", "\n", ">>> np.ones((5,), dtype=int)\n", "array([1, 1, 1, 1, 1])\n", "\n", ">>> np.ones((2, 1))\n", "array([[1.],\n", " [1.]])\n", "\n", ">>> s = (2,2)\n", ">>> np.ones(s)\n", "array([[1., 1.],\n", " [1., 1.]])\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/core/numeric.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.ones?\n", "np.ones(0, dtype=int), np.ones((2, 3, 4), dtype=int) # initialize values to int 1" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.012964Z", "start_time": "2020-11-17T05:05:09.000599Z" } }, "outputs": [ { "data": { "text/plain": [ "(array([], shape=(0, 0), dtype=float64),\n", " array([[1.]]),\n", " array([[1., 0.],\n", " [0., 1.]]),\n", " array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]]))" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0meye\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mN\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mM\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mk\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m<\u001b[0m\u001b[0;32mclass\u001b[0m \u001b[0;34m'float'\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0morder\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'C'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlike\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Return a 2-D array with ones on the diagonal and zeros elsewhere.\n", "\n", "Parameters\n", "----------\n", "N : int\n", " Number of rows in the output.\n", "M : int, optional\n", " Number of columns in the output. If None, defaults to `N`.\n", "k : int, optional\n", " Index of the diagonal: 0 (the default) refers to the main diagonal,\n", " a positive value refers to an upper diagonal, and a negative value\n", " to a lower diagonal.\n", "dtype : data-type, optional\n", " Data-type of the returned array.\n", "order : {'C', 'F'}, optional\n", " Whether the output should be stored in row-major (C-style) or\n", " column-major (Fortran-style) order in memory.\n", "\n", " .. versionadded:: 1.14.0\n", "like : array_like\n", " Reference object to allow the creation of arrays which are not\n", " NumPy arrays. If an array-like passed in as ``like`` supports\n", " the ``__array_function__`` protocol, the result will be defined\n", " by it. In this case, it ensures the creation of an array object\n", " compatible with that passed in via this argument.\n", "\n", " .. note::\n", " The ``like`` keyword is an experimental feature pending on\n", " acceptance of :ref:`NEP 35 `.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "I : ndarray of shape (N,M)\n", " An array where all elements are equal to zero, except for the `k`-th\n", " diagonal, whose values are equal to one.\n", "\n", "See Also\n", "--------\n", "identity : (almost) equivalent function\n", "diag : diagonal 2-D array from a 1-D array specified by the user.\n", "\n", "Examples\n", "--------\n", ">>> np.eye(2, dtype=int)\n", "array([[1, 0],\n", " [0, 1]])\n", ">>> np.eye(3, k=1)\n", "array([[0., 1., 0.],\n", " [0., 0., 1.],\n", " [0., 0., 0.]])\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/lib/twodim_base.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.eye?\n", "np.eye(0), np.eye(1), np.eye(2), np.eye(3) # identity matrices" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.028409Z", "start_time": "2020-11-17T05:05:09.014926Z" } }, "outputs": [ { "data": { "text/plain": [ "(array([[0]]),\n", " array([[0, 0],\n", " [0, 1]]),\n", " array([[0., 1., 0., 0.],\n", " [0., 0., 1., 0.],\n", " [0., 0., 0., 1.],\n", " [0., 0., 0., 0.]]))" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdiag\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mv\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mk\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Extract a diagonal or construct a diagonal array.\n", "\n", "See the more detailed documentation for ``numpy.diagonal`` if you use this\n", "function to extract a diagonal and wish to write to the resulting array;\n", "whether it returns a copy or a view depends on what version of numpy you\n", "are using.\n", "\n", "Parameters\n", "----------\n", "v : array_like\n", " If `v` is a 2-D array, return a copy of its `k`-th diagonal.\n", " If `v` is a 1-D array, return a 2-D array with `v` on the `k`-th\n", " diagonal.\n", "k : int, optional\n", " Diagonal in question. The default is 0. Use `k>0` for diagonals\n", " above the main diagonal, and `k<0` for diagonals below the main\n", " diagonal.\n", "\n", "Returns\n", "-------\n", "out : ndarray\n", " The extracted diagonal or constructed diagonal array.\n", "\n", "See Also\n", "--------\n", "diagonal : Return specified diagonals.\n", "diagflat : Create a 2-D array with the flattened input as a diagonal.\n", "trace : Sum along diagonals.\n", "triu : Upper triangle of an array.\n", "tril : Lower triangle of an array.\n", "\n", "Examples\n", "--------\n", ">>> x = np.arange(9).reshape((3,3))\n", ">>> x\n", "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])\n", "\n", ">>> np.diag(x)\n", "array([0, 4, 8])\n", ">>> np.diag(x, k=1)\n", "array([1, 5])\n", ">>> np.diag(x, k=-1)\n", "array([3, 7])\n", "\n", ">>> np.diag(np.diag(x))\n", "array([[0, 0, 0],\n", " [0, 4, 0],\n", " [0, 0, 8]])\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/lib/twodim_base.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.diag?\n", "np.diag(range(1)), np.diag(range(2)), np.diag(np.ones(3), k=1) # diagonal matrices" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.041080Z", "start_time": "2020-11-17T05:05:09.029954Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([], dtype=float64),\n", " array([[[94753737787328, 0, 0, 0],\n", " [ 0, 0, 0, 0],\n", " [ 0, 0, 0, 0]],\n", " \n", " [[ 0, 0, 0, 0],\n", " [ 0, 0, 0, 0],\n", " [ 0, 0, 0, 0]]]))" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mDocstring:\u001b[0m\n", "empty(shape, dtype=float, order='C', *, like=None)\n", "\n", "Return a new array of given shape and type, without initializing entries.\n", "\n", "Parameters\n", "----------\n", "shape : int or tuple of int\n", " Shape of the empty array, e.g., ``(2, 3)`` or ``2``.\n", "dtype : data-type, optional\n", " Desired output data-type for the array, e.g, `numpy.int8`. Default is\n", " `numpy.float64`.\n", "order : {'C', 'F'}, optional, default: 'C'\n", " Whether to store multi-dimensional data in row-major\n", " (C-style) or column-major (Fortran-style) order in\n", " memory.\n", "like : array_like\n", " Reference object to allow the creation of arrays which are not\n", " NumPy arrays. If an array-like passed in as ``like`` supports\n", " the ``__array_function__`` protocol, the result will be defined\n", " by it. In this case, it ensures the creation of an array object\n", " compatible with that passed in via this argument.\n", "\n", " .. note::\n", " The ``like`` keyword is an experimental feature pending on\n", " acceptance of :ref:`NEP 35 `.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "out : ndarray\n", " Array of uninitialized (arbitrary) data of the given shape, dtype, and\n", " order. Object arrays will be initialized to None.\n", "\n", "See Also\n", "--------\n", "empty_like : Return an empty array with shape and type of input.\n", "ones : Return a new array setting values to one.\n", "zeros : Return a new array setting values to zero.\n", "full : Return a new array of given shape filled with value.\n", "\n", "\n", "Notes\n", "-----\n", "`empty`, unlike `zeros`, does not set the array values to zero,\n", "and may therefore be marginally faster. On the other hand, it requires\n", "the user to manually set all the values in the array, and should be\n", "used with caution.\n", "\n", "Examples\n", "--------\n", ">>> np.empty([2, 2])\n", "array([[ -9.74499359e+001, 6.69583040e-309],\n", " [ 2.13182611e-314, 3.06959433e-309]]) #uninitialized\n", "\n", ">>> np.empty([2, 2], dtype=int)\n", "array([[-1073741821, -1067949133],\n", " [ 496041986, 19249760]]) #uninitialized\n", "\u001b[0;31mType:\u001b[0m builtin_function_or_method\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.empty?\n", "np.empty(0), np.empty(\n", " (2, 3, 4), dtype=int\n", ") # create array faster without initialization" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`numpy` also provides functions to build an array using rules." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.109703Z", "start_time": "2020-11-17T05:05:09.099795Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([0, 1, 2, 3, 4]), array([4]), array([4.5, 5. ]))" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mDocstring:\u001b[0m\n", "arange([start,] stop[, step,], dtype=None, *, like=None)\n", "\n", "Return evenly spaced values within a given interval.\n", "\n", "Values are generated within the half-open interval ``[start, stop)``\n", "(in other words, the interval including `start` but excluding `stop`).\n", "For integer arguments the function is equivalent to the Python built-in\n", "`range` function, but returns an ndarray rather than a list.\n", "\n", "When using a non-integer step, such as 0.1, the results will often not\n", "be consistent. It is better to use `numpy.linspace` for these cases.\n", "\n", "Parameters\n", "----------\n", "start : integer or real, optional\n", " Start of interval. The interval includes this value. The default\n", " start value is 0.\n", "stop : integer or real\n", " End of interval. The interval does not include this value, except\n", " in some cases where `step` is not an integer and floating point\n", " round-off affects the length of `out`.\n", "step : integer or real, optional\n", " Spacing between values. For any output `out`, this is the distance\n", " between two adjacent values, ``out[i+1] - out[i]``. The default\n", " step size is 1. If `step` is specified as a position argument,\n", " `start` must also be given.\n", "dtype : dtype\n", " The type of the output array. If `dtype` is not given, infer the data\n", " type from the other input arguments.\n", "like : array_like\n", " Reference object to allow the creation of arrays which are not\n", " NumPy arrays. If an array-like passed in as ``like`` supports\n", " the ``__array_function__`` protocol, the result will be defined\n", " by it. In this case, it ensures the creation of an array object\n", " compatible with that passed in via this argument.\n", "\n", " .. note::\n", " The ``like`` keyword is an experimental feature pending on\n", " acceptance of :ref:`NEP 35 `.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "arange : ndarray\n", " Array of evenly spaced values.\n", "\n", " For floating point arguments, the length of the result is\n", " ``ceil((stop - start)/step)``. Because of floating point overflow,\n", " this rule may result in the last element of `out` being greater\n", " than `stop`.\n", "\n", "See Also\n", "--------\n", "numpy.linspace : Evenly spaced numbers with careful handling of endpoints.\n", "numpy.ogrid: Arrays of evenly spaced numbers in N-dimensions.\n", "numpy.mgrid: Grid-shaped arrays of evenly spaced numbers in N-dimensions.\n", "\n", "Examples\n", "--------\n", ">>> np.arange(3)\n", "array([0, 1, 2])\n", ">>> np.arange(3.0)\n", "array([ 0., 1., 2.])\n", ">>> np.arange(3,7)\n", "array([3, 4, 5, 6])\n", ">>> np.arange(3,7,2)\n", "array([3, 5])\n", "\u001b[0;31mType:\u001b[0m builtin_function_or_method\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.arange?\n", "np.arange(5), np.arange(4, 5), np.arange(\n", " 4.5, 5.5, 0.5\n", ") # like range but allow non-integer parameters" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.124941Z", "start_time": "2020-11-17T05:05:09.112456Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([4. , 4.02040816, 4.04081633, 4.06122449, 4.08163265,\n", " 4.10204082, 4.12244898, 4.14285714, 4.16326531, 4.18367347,\n", " 4.20408163, 4.2244898 , 4.24489796, 4.26530612, 4.28571429,\n", " 4.30612245, 4.32653061, 4.34693878, 4.36734694, 4.3877551 ,\n", " 4.40816327, 4.42857143, 4.44897959, 4.46938776, 4.48979592,\n", " 4.51020408, 4.53061224, 4.55102041, 4.57142857, 4.59183673,\n", " 4.6122449 , 4.63265306, 4.65306122, 4.67346939, 4.69387755,\n", " 4.71428571, 4.73469388, 4.75510204, 4.7755102 , 4.79591837,\n", " 4.81632653, 4.83673469, 4.85714286, 4.87755102, 4.89795918,\n", " 4.91836735, 4.93877551, 4.95918367, 4.97959184, 5. ]),\n", " array([4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. ]),\n", " array([4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. ]))" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m\n", "\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinspace\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstart\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstop\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mnum\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m50\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mendpoint\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mretstep\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Return evenly spaced numbers over a specified interval.\n", "\n", "Returns `num` evenly spaced samples, calculated over the\n", "interval [`start`, `stop`].\n", "\n", "The endpoint of the interval can optionally be excluded.\n", "\n", ".. versionchanged:: 1.16.0\n", " Non-scalar `start` and `stop` are now supported.\n", "\n", ".. versionchanged:: 1.20.0\n", " Values are rounded towards ``-inf`` instead of ``0`` when an\n", " integer ``dtype`` is specified. The old behavior can\n", " still be obtained with ``np.linspace(start, stop, num).astype(int)``\n", "\n", "Parameters\n", "----------\n", "start : array_like\n", " The starting value of the sequence.\n", "stop : array_like\n", " The end value of the sequence, unless `endpoint` is set to False.\n", " In that case, the sequence consists of all but the last of ``num + 1``\n", " evenly spaced samples, so that `stop` is excluded. Note that the step\n", " size changes when `endpoint` is False.\n", "num : int, optional\n", " Number of samples to generate. Default is 50. Must be non-negative.\n", "endpoint : bool, optional\n", " If True, `stop` is the last sample. Otherwise, it is not included.\n", " Default is True.\n", "retstep : bool, optional\n", " If True, return (`samples`, `step`), where `step` is the spacing\n", " between samples.\n", "dtype : dtype, optional\n", " The type of the output array. If `dtype` is not given, the data type\n", " is inferred from `start` and `stop`. The inferred dtype will never be\n", " an integer; `float` is chosen even if the arguments would produce an\n", " array of integers.\n", "\n", " .. versionadded:: 1.9.0\n", "\n", "axis : int, optional\n", " The axis in the result to store the samples. Relevant only if start\n", " or stop are array-like. By default (0), the samples will be along a\n", " new axis inserted at the beginning. Use -1 to get an axis at the end.\n", "\n", " .. versionadded:: 1.16.0\n", "\n", "Returns\n", "-------\n", "samples : ndarray\n", " There are `num` equally spaced samples in the closed interval\n", " ``[start, stop]`` or the half-open interval ``[start, stop)``\n", " (depending on whether `endpoint` is True or False).\n", "step : float, optional\n", " Only returned if `retstep` is True\n", "\n", " Size of spacing between samples.\n", "\n", "\n", "See Also\n", "--------\n", "arange : Similar to `linspace`, but uses a step size (instead of the\n", " number of samples).\n", "geomspace : Similar to `linspace`, but with numbers spaced evenly on a log\n", " scale (a geometric progression).\n", "logspace : Similar to `geomspace`, but with the end points specified as\n", " logarithms.\n", "\n", "Examples\n", "--------\n", ">>> np.linspace(2.0, 3.0, num=5)\n", "array([2. , 2.25, 2.5 , 2.75, 3. ])\n", ">>> np.linspace(2.0, 3.0, num=5, endpoint=False)\n", "array([2. , 2.2, 2.4, 2.6, 2.8])\n", ">>> np.linspace(2.0, 3.0, num=5, retstep=True)\n", "(array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)\n", "\n", "Graphical illustration:\n", "\n", ">>> import matplotlib.pyplot as plt\n", ">>> N = 8\n", ">>> y = np.zeros(N)\n", ">>> x1 = np.linspace(0, 10, N, endpoint=True)\n", ">>> x2 = np.linspace(0, 10, N, endpoint=False)\n", ">>> plt.plot(x1, y, 'o')\n", "[]\n", ">>> plt.plot(x2, y + 0.5, 'o')\n", "[]\n", ">>> plt.ylim([-0.5, 1])\n", "(-0.5, 1)\n", ">>> plt.show()\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/core/function_base.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.linspace?\n", "np.linspace(4, 5), np.linspace(4, 5, 11), np.linspace(\n", " 4, 5, 11\n", ") # can specify number of points instead of step" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.133901Z", "start_time": "2020-11-17T05:05:09.126689Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0., 0.],\n", " [0., 1., 2., 3.],\n", " [0., 2., 4., 6.]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m\n", "\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfromfunction\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mfunction\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m<\u001b[0m\u001b[0;32mclass\u001b[0m \u001b[0;34m'float'\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mlike\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Construct an array by executing a function over each coordinate.\n", "\n", "The resulting array therefore has a value ``fn(x, y, z)`` at\n", "coordinate ``(x, y, z)``.\n", "\n", "Parameters\n", "----------\n", "function : callable\n", " The function is called with N parameters, where N is the rank of\n", " `shape`. Each parameter represents the coordinates of the array\n", " varying along a specific axis. For example, if `shape`\n", " were ``(2, 2)``, then the parameters would be\n", " ``array([[0, 0], [1, 1]])`` and ``array([[0, 1], [0, 1]])``\n", "shape : (N,) tuple of ints\n", " Shape of the output array, which also determines the shape of\n", " the coordinate arrays passed to `function`.\n", "dtype : data-type, optional\n", " Data-type of the coordinate arrays passed to `function`.\n", " By default, `dtype` is float.\n", "like : array_like\n", " Reference object to allow the creation of arrays which are not\n", " NumPy arrays. If an array-like passed in as ``like`` supports\n", " the ``__array_function__`` protocol, the result will be defined\n", " by it. In this case, it ensures the creation of an array object\n", " compatible with that passed in via this argument.\n", "\n", " .. note::\n", " The ``like`` keyword is an experimental feature pending on\n", " acceptance of :ref:`NEP 35 `.\n", "\n", " .. versionadded:: 1.20.0\n", "\n", "Returns\n", "-------\n", "fromfunction : any\n", " The result of the call to `function` is passed back directly.\n", " Therefore the shape of `fromfunction` is completely determined by\n", " `function`. If `function` returns a scalar value, the shape of\n", " `fromfunction` would not match the `shape` parameter.\n", "\n", "See Also\n", "--------\n", "indices, meshgrid\n", "\n", "Notes\n", "-----\n", "Keywords other than `dtype` are passed to `function`.\n", "\n", "Examples\n", "--------\n", ">>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)\n", "array([[ True, False, False],\n", " [False, True, False],\n", " [False, False, True]])\n", "\n", ">>> np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int)\n", "array([[0, 1, 2],\n", " [1, 2, 3],\n", " [2, 3, 4]])\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/core/numeric.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.fromfunction?\n", "np.fromfunction(lambda i, j: i * j, (3, 4)) # can initialize using a function" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "We can also reshape an array using the `reshape` method/function:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.225964Z", "start_time": "2020-11-17T05:05:09.214048Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,\n", " 17, 18, 19, 20, 21, 22, 23]),\n", " array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]],\n", " \n", " [[12, 13, 14, 15],\n", " [16, 17, 18, 19],\n", " [20, 21, 22, 23]]]),\n", " array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]],\n", " \n", " [[12, 13, 14, 15],\n", " [16, 17, 18, 19],\n", " [20, 21, 22, 23]]]),\n", " array([[[ 0, 6, 12, 18],\n", " [ 2, 8, 14, 20],\n", " [ 4, 10, 16, 22]],\n", " \n", " [[ 1, 7, 13, 19],\n", " [ 3, 9, 15, 21],\n", " [ 5, 11, 17, 23]]]))" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mDocstring:\u001b[0m\n", "a.reshape(shape, order='C')\n", "\n", "Returns an array containing the same data with a new shape.\n", "\n", "Refer to `numpy.reshape` for full documentation.\n", "\n", "See Also\n", "--------\n", "numpy.reshape : equivalent function\n", "\n", "Notes\n", "-----\n", "Unlike the free function `numpy.reshape`, this method on `ndarray` allows\n", "the elements of the shape parameter to be passed in as separate arguments.\n", "For example, ``a.reshape(10, 11)`` is equivalent to\n", "``a.reshape((10, 11))``.\n", "\u001b[0;31mType:\u001b[0m builtin_function_or_method\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "array = np.arange(2 * 3 * 4)\n", "array.reshape?\n", "(\n", " array,\n", " array.reshape(2, 3, 4), # last axis index changes fastest\n", " array.reshape(2, 3, -1), # size of last axis calculated automatically\n", " array.reshape((2, 3, 4), order=\"F\"),\n", ") # first axis index changes fastest" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`ravel` is a special case of reshaping an array to one dimension. \n", "(As similar function `flatten` returns a copy of the array but `ravel` and `reshape` returns a dynamic view whenever possible.)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:09.335182Z", "start_time": "2020-11-17T05:05:09.326890Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]],\n", " \n", " [[12, 13, 14, 15],\n", " [16, 17, 18, 19],\n", " [20, 21, 22, 23]]]),\n", " array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,\n", " 17, 18, 19, 20, 21, 22, 23]),\n", " array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,\n", " 17, 18, 19, 20, 21, 22, 23]),\n", " array([ 0, 12, 4, 16, 8, 20, 1, 13, 5, 17, 9, 21, 2, 14, 6, 18, 10,\n", " 22, 3, 15, 7, 19, 11, 23]))" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array = np.arange(2 * 3 * 4).reshape(2, 3, 4)\n", "array, array.ravel(), array.reshape(-1), array.ravel(order=\"F\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Correct the following function to print every element of an array line-by-line.\n", "```Python\n", "def print_array_entries_line_by_line(array):\n", " for i in array:\n", " print(i)\n", "```" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2020-11-17T05:05:15.307473Z", "start_time": "2020-11-17T05:05:15.296495Z" }, "nbgrader": { "grade": false, "grade_id": "flatten", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" }, "tags": [ "remove-output" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n", "10\n", "11\n", "12\n", "13\n", "14\n", "15\n", "16\n", "17\n", "18\n", "19\n", "20\n", "21\n", "22\n", "23\n" ] } ], "source": [ "def print_array_entries_line_by_line(array):\n", " ### BEGIN SOLUTION\n", " for i in array.flatten():\n", " print(i)\n", " ### END SOLUTION\n", "\n", "\n", "print_array_entries_line_by_line(np.arange(2 * 3 * 4).reshape(2, 3, 4))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Operating on `numpy` arrays" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to verify the solution of a system of linear equations?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Before solving the system of linear equations, let us try to verify a solution to the equations:\n", "\n", "$$ \\begin{aligned}\n", "2x_0 + 2x_1 &= 1\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= 1\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`numpy` provides the function `matmul` and the operator `@` for matrix multiplication." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T07:30:32.601700Z", "start_time": "2020-11-13T07:30:32.592763Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[False False]\n", "[ True True]\n" ] } ], "source": [ "print(np.matmul(A, np.array([0, 0])) == b)\n", "print(A @ np.array([0, 0.5]) == b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Note that the comparison on `numpy` arrays returns a boolean array instead of a boolean value, unlike the comparison operations on lists." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To check whether all items are true, we use the `all` method." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T07:40:09.717569Z", "start_time": "2020-11-13T07:40:09.708770Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "False\n", "True\n" ] } ], "source": [ "print((np.matmul(A, np.array([0, 0])) == b).all())\n", "print((A @ np.array([0, 0.5]) == b).all())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to concatenate arrays?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "We will operate on an augmented matrix of the coefficients:\n", "\n", "$$ \\begin{aligned} \\mathbf{C} &= \\begin{bmatrix}\n", "\\mathbf{A} & \\mathbf{b}\n", "\\end{bmatrix}\\\\\n", "&= \\begin{bmatrix}\n", "a_{00} & a_{01} & b_0 \\\\\n", "a_{10} & a_{11} & b_1\n", "\\end{bmatrix} \n", "\\end{aligned}\n", "$$\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`numpy` provides functions to create block matrices:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T05:59:47.144831Z", "start_time": "2020-11-13T05:59:47.133871Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[2., 2., 1.],\n", " [0., 2., 1.]])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mblock\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0marrays\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Assemble an nd-array from nested lists of blocks.\n", "\n", "Blocks in the innermost lists are concatenated (see `concatenate`) along\n", "the last dimension (-1), then these are concatenated along the\n", "second-last dimension (-2), and so on until the outermost list is reached.\n", "\n", "Blocks can be of any dimension, but will not be broadcasted using the normal\n", "rules. Instead, leading axes of size 1 are inserted, to make ``block.ndim``\n", "the same for all blocks. This is primarily useful for working with scalars,\n", "and means that code like ``np.block([v, 1])`` is valid, where\n", "``v.ndim == 1``.\n", "\n", "When the nested list is two levels deep, this allows block matrices to be\n", "constructed from their components.\n", "\n", ".. versionadded:: 1.13.0\n", "\n", "Parameters\n", "----------\n", "arrays : nested list of array_like or scalars (but not tuples)\n", " If passed a single ndarray or scalar (a nested list of depth 0), this\n", " is returned unmodified (and not copied).\n", "\n", " Elements shapes must match along the appropriate axes (without\n", " broadcasting), but leading 1s will be prepended to the shape as\n", " necessary to make the dimensions match.\n", "\n", "Returns\n", "-------\n", "block_array : ndarray\n", " The array assembled from the given blocks.\n", "\n", " The dimensionality of the output is equal to the greatest of:\n", " * the dimensionality of all the inputs\n", " * the depth to which the input list is nested\n", "\n", "Raises\n", "------\n", "ValueError\n", " * If list depths are mismatched - for instance, ``[[a, b], c]`` is\n", " illegal, and should be spelt ``[[a, b], [c]]``\n", " * If lists are empty - for instance, ``[[a, b], []]``\n", "\n", "See Also\n", "--------\n", "concatenate : Join a sequence of arrays along an existing axis.\n", "stack : Join a sequence of arrays along a new axis.\n", "vstack : Stack arrays in sequence vertically (row wise).\n", "hstack : Stack arrays in sequence horizontally (column wise).\n", "dstack : Stack arrays in sequence depth wise (along third axis).\n", "column_stack : Stack 1-D arrays as columns into a 2-D array.\n", "vsplit : Split an array into multiple sub-arrays vertically (row-wise).\n", "\n", "Notes\n", "-----\n", "\n", "When called with only scalars, ``np.block`` is equivalent to an ndarray\n", "call. So ``np.block([[1, 2], [3, 4]])`` is equivalent to\n", "``np.array([[1, 2], [3, 4]])``.\n", "\n", "This function does not enforce that the blocks lie on a fixed grid.\n", "``np.block([[a, b], [c, d]])`` is not restricted to arrays of the form::\n", "\n", " AAAbb\n", " AAAbb\n", " cccDD\n", "\n", "But is also allowed to produce, for some ``a, b, c, d``::\n", "\n", " AAAbb\n", " AAAbb\n", " cDDDD\n", "\n", "Since concatenation happens along the last axis first, `block` is _not_\n", "capable of producing the following directly::\n", "\n", " AAAbb\n", " cccbb\n", " cccDD\n", "\n", "Matlab's \"square bracket stacking\", ``[A, B, ...; p, q, ...]``, is\n", "equivalent to ``np.block([[A, B, ...], [p, q, ...]])``.\n", "\n", "Examples\n", "--------\n", "The most common use of this function is to build a block matrix\n", "\n", ">>> A = np.eye(2) * 2\n", ">>> B = np.eye(3) * 3\n", ">>> np.block([\n", "... [A, np.zeros((2, 3))],\n", "... [np.ones((3, 2)), B ]\n", "... ])\n", "array([[2., 0., 0., 0., 0.],\n", " [0., 2., 0., 0., 0.],\n", " [1., 1., 3., 0., 0.],\n", " [1., 1., 0., 3., 0.],\n", " [1., 1., 0., 0., 3.]])\n", "\n", "With a list of depth 1, `block` can be used as `hstack`\n", "\n", ">>> np.block([1, 2, 3]) # hstack([1, 2, 3])\n", "array([1, 2, 3])\n", "\n", ">>> a = np.array([1, 2, 3])\n", ">>> b = np.array([2, 3, 4])\n", ">>> np.block([a, b, 10]) # hstack([a, b, 10])\n", "array([ 1, 2, 3, 2, 3, 4, 10])\n", "\n", ">>> A = np.ones((2, 2), int)\n", ">>> B = 2 * A\n", ">>> np.block([A, B]) # hstack([A, B])\n", "array([[1, 1, 2, 2],\n", " [1, 1, 2, 2]])\n", "\n", "With a list of depth 2, `block` can be used in place of `vstack`:\n", "\n", ">>> a = np.array([1, 2, 3])\n", ">>> b = np.array([2, 3, 4])\n", ">>> np.block([[a], [b]]) # vstack([a, b])\n", "array([[1, 2, 3],\n", " [2, 3, 4]])\n", "\n", ">>> A = np.ones((2, 2), int)\n", ">>> B = 2 * A\n", ">>> np.block([[A], [B]]) # vstack([A, B])\n", "array([[1, 1],\n", " [1, 1],\n", " [2, 2],\n", " [2, 2]])\n", "\n", "It can also be used in places of `atleast_1d` and `atleast_2d`\n", "\n", ">>> a = np.array(0)\n", ">>> b = np.array([1])\n", ">>> np.block([a]) # atleast_1d(a)\n", "array([0])\n", ">>> np.block([b]) # atleast_1d(b)\n", "array([1])\n", "\n", ">>> np.block([[a]]) # atleast_2d(a)\n", "array([[0]])\n", ">>> np.block([[b]]) # atleast_2d(b)\n", "array([[1]])\n", "\u001b[0;31mFile:\u001b[0m /opt/conda/lib/python3.8/site-packages/numpy/core/shape_base.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "np.block?\n", "C = np.block([A, b.reshape(-1, 1)]) # reshape to ensure same ndim\n", "C" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To stack an array along different axes:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T05:24:28.765762Z", "start_time": "2020-11-13T05:24:28.755161Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[0 1 2]\n", " [3 4 5]]] \n", "shape: (1, 2, 3)\n", "[[[0 1 2]\n", " [3 4 5]\n", " [0 1 2]\n", " [3 4 5]]] \n", "shape: (1, 4, 3)\n", "[[[0 1 2]\n", " [3 4 5]]\n", "\n", " [[0 1 2]\n", " [3 4 5]]] \n", "shape: (2, 2, 3)\n", "[[[0 1 2 0 1 2]\n", " [3 4 5 3 4 5]]] \n", "shape: (1, 2, 6)\n", "[[[[0 1 2]\n", " [3 4 5]]]\n", "\n", "\n", " [[[0 1 2]\n", " [3 4 5]]]] \n", "shape: (2, 1, 2, 3)\n" ] } ], "source": [ "array = np.arange(1 * 2 * 3).reshape(1, 2, 3)\n", "for concat_array in [\n", " array,\n", " np.hstack((array, array)), # stack along the first axis\n", " np.vstack((array, array)), # second axis\n", " np.concatenate((array, array), axis=-1), # last axis\n", " np.stack((array, array), axis=0),\n", "]: # new axis\n", " print(concat_array, \"\\nshape:\", concat_array.shape)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**How to perform arithmetic operations on a `numpy` array?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "To divide all the coefficients by $2$, we can simply write:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:01:45.331879Z", "start_time": "2020-11-13T06:01:45.324345Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[1. , 1. , 0.5],\n", " [0. , 1. , 0.5]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D = C / 2\n", "D" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that the above does not work for `list`." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:01:49.982673Z", "start_time": "2020-11-13T06:01:49.969793Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for /: 'list' and 'int'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/tmp/ipykernel_861/738194356.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mC\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtolist\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m/\u001b[0m \u001b[0;36m2\u001b[0m \u001b[0;31m# deep convert to list\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for /: 'list' and 'int'" ] } ], "source": [ "C.tolist() / 2 # deep convert to list" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Arithmetic operations on `numpy` arrays apply if the arrays have compatible dimensions. Two dimensions are compatible when\n", "- they are equal, except for\n", "- components equal to 1." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`numpy` uses [broadcasting rules](https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules) to stretch the axis of size 1 up to match the corresponding axis in other arrays. \n", "`C / 2` is a example where the second operand $2$ is broadcasted to a $2$-by-$2$ matrix before the elementwise division. Another example is as follows. " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:00:08.500533Z", "start_time": "2020-11-13T06:00:08.492469Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "[[0]\n", " [1]\n", " [2]]\n", "*\n", "[[0 1 2 3]]\n", "==\n", "[[0 0 0 0]\n", " [0 1 2 3]\n", " [0 2 4 6]]\n", "\n" ] } ], "source": [ "three_by_one = np.arange(3).reshape(3, 1)\n", "one_by_four = np.arange(4).reshape(1, 4)\n", "print(\n", " f\"\"\"\n", "{three_by_one}\n", "*\n", "{one_by_four}\n", "==\n", "{three_by_one * one_by_four}\n", "\"\"\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Next, to subtract the second row of the coefficients from the first row:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:01:59.606165Z", "start_time": "2020-11-13T06:01:59.597364Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[1. , 0. , 0. ],\n", " [0. , 1. , 0.5]])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D[0, :] = D[0, :] - D[1, :]\n", "D" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Notice the use of commas to index different dimensions instead of using multiple brackets:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T07:21:01.471294Z", "start_time": "2020-11-13T07:21:01.467371Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "assert (D[0][:] == D[0, :]).all()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Using this indexing technique, it is easy to extract the last column as the solution to the system of linear equations:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:02:32.769524Z", "start_time": "2020-11-13T06:02:32.762270Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.5])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = D[:, -1]\n", "x" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This gives the desired solution $x_0=0$ and $x_1=0.5$ for\n", "\n", "$$ \\begin{aligned}\n", "2x_0 + 2x_1 &= 1\\\\\n", "\\hphantom{2x_0 +} 2x_1 &= 1\\\\\n", "\\end{aligned}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`numpy` provides many [convenient ways](https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing) to index an array." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "ExecuteTime": { "end_time": "2020-11-14T00:26:10.075502Z", "start_time": "2020-11-14T00:26:10.069327Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([[0, 1, 2],\n", " [3, 4, 5]]),\n", " array([2, 3]))" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.arange(2 * 3).reshape(2, 3)\n", "B, B[(0, 1), (2, 0)] # selecting the corners using integer array" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:03:04.487294Z", "start_time": "2020-11-13T06:03:04.476670Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]],\n", " \n", " [[12, 13, 14, 15],\n", " [16, 17, 18, 19],\n", " [20, 21, 22, 23]]]),\n", " array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]]),\n", " array([[ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]]),\n", " array([ 6, 11]),\n", " array([[ 6, 11],\n", " [18, 23]]))" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.arange(2 * 3 * 4).reshape(2, 3, 4)\n", "B, B[0], B[0, (1, 2)], B[0, (1, 2), (2, 3)], B[\n", " :, (1, 2), (2, 3)\n", "] # pay attention to the last two cases" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:03:09.699021Z", "start_time": "2020-11-13T06:03:09.690560Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 3, 7, 11],\n", " [15, 19, 23]])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assert (B[..., -1] == B[:, :, -1]).all()\n", "B[..., -1] # ... expands to selecting all elements of all previous dimensions" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T06:03:10.191541Z", "start_time": "2020-11-13T06:03:10.184760Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,\n", " 23])" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B[B > 5] # indexing using boolean array" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Finally, the following function solves a system of 2 linear equations with 2 variables." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T13:42:20.247584Z", "start_time": "2020-11-13T13:42:20.241223Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "def solve_2_by_2_system(A, b):\n", " \"\"\"Returns the unique solution of the linear system, if exists,\n", " else returns None.\"\"\"\n", " C = np.hstack((A, b.reshape(-1, 1)))\n", " if C[0, 0] == 0:\n", " C = C[(1, 0), :]\n", " if C[0, 0] == 0:\n", " return None\n", " C[0, :] = C[0, :] / C[0, 0]\n", " C[1, :] = C[1, :] - C[0, :] * C[1, 0]\n", " if C[1, 1] == 0:\n", " return None\n", " C[1, :] = C[1, :] / C[1, 1]\n", " C[0, :] = C[0, :] - C[1, :] * C[0, 1]\n", " return C[:, -1]" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "ExecuteTime": { "end_time": "2020-11-13T13:44:31.275619Z", "start_time": "2020-11-13T13:44:31.267479Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A=[[1. 0.]\n", " [0. 1.]]\n", "b=[1. 1.]\n", "x=[1. 1.]\n", "\n", "A=[[1. 1.]\n", " [1. 1.]]\n", "b=[1. 1.]\n", "x=None\n", "\n", "A=[[1. 1.]\n", " [0. 0.]]\n", "b=[1. 1.]\n", "x=None\n", "\n", "A=[[1. 0.]\n", " [1. 0.]]\n", "b=[1. 1.]\n", "x=None\n", "\n" ] } ], "source": [ "# tests\n", "for A in (\n", " np.eye(2),\n", " np.ones((2, 2)),\n", " np.stack((np.ones(2), np.zeros(2))),\n", " np.stack((np.ones(2), np.zeros(2)), axis=1),\n", "):\n", " print(f\"A={A}\\nb={b}\\nx={solve_2_by_2_system(A,b)}\\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Universal functions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Why does the first line of code below return two arrays but the second code return only one array? Shouldn't the first line of code return the following?\n", "```Python\n", "array([[(0,1), (0,2), (0,3)],\n", " [(1,1), (1,2), (1,3)]])\n", "```" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "ExecuteTime": { "end_time": "2020-11-14T00:34:59.921029Z", "start_time": "2020-11-14T00:34:59.914787Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(array([[0, 0, 0],\n", " [1, 1, 1]]), array([[0, 1, 2],\n", " [0, 1, 2]]))\n", "[[0 0 0]\n", " [0 1 2]]\n" ] } ], "source": [ "print(np.fromfunction(lambda i, j: (i, j), (2, 3), dtype=int))\n", "print(np.fromfunction(lambda i, j: (i * j), (2, 3), dtype=int))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "From the documentation, `fromfunction` applies the given function to the two arrays as arguments.\n", "- The first line of code returns a tuple of the arrays.\n", "- The second line of code multiplies the two arrays to give one array, according to how multiplication works for numpy arrays." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Indeed, `numpy` implements [universal/vectorized functions/operators](https://numpy.org/doc/stable/reference/ufuncs.html) that take arrays as arguments and perform operations with appropriate broadcasting rules. The following is an example that uses the universal function `np.sin`:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "ExecuteTime": { "end_time": "2020-11-14T00:46:58.019048Z", "start_time": "2020-11-14T00:46:57.697504Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3c3d8dbb2c734db79064c52d86279974", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(IntSlider(value=1, description='a', max=5), FloatSlider(value=0.0, description='b', max=…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "\n", "@widgets.interact(a=(0, 5, 1), b=(-1, 1, 0.1))\n", "def plot_sin(a=1, b=0):\n", " x = np.linspace(0, 2 * math.pi)\n", " plt.plot(x, np.sin(a * x + b * math.pi)) # np.sin, *, + are universal functions\n", " plt.title(r\"$\\sin(ax+b\\pi)$\")\n", " plt.xlabel(r\"$x$ (radian)\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In addition to making the code shorter, universal functions are both efficient and flexible. (Recall the Monte Carlo simulation to approximate $\\pi$.)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Exercise** Explain how the Monte Carlo simulation work using universal functions:\n", "```Python\n", "def np_approximate_pi(n):\n", " in_circle = (np.random.random((n,2))**2).sum(axis=-1) < 1\n", " mean = 4 * in_circle.mean()\n", " std = 4 * in_circle.std() / n**0.5\n", " return np.array([mean - 2*std, mean + 2*std])\n", "```" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": true, "grade_id": "universal", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false } }, "source": [ "- `random.random` generates a numpy array for $n$ points in the unit square randomly.\n", "- `sum` sums up the element along the last axis to give the squared distance.\n", "- `<` returns the boolean array indicating whether each point is in the first quadrant of the inscribed circle.\n", "- `mean` and `std` returns the mean and standard deviation of the boolean array with True and False interpreted as 1 and 0 respectively." ] } ], "metadata": { "celltoolbar": "Create Assignment", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "rise": { "enable_chalkboard": true, "scroll": true, "theme": "white" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "195px", "width": "330px" }, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "454.418px", "left": "1533px", "top": "110.284px", "width": "261px" }, "toc_section_display": true, "toc_window_display": false }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }